To estimate home ranges with Kernel Density Estimation (KDE).
Packages:
library(ggplot2)
library(plyr)
library(sf)
library(adehabitatHR)
library(mapview)
load("./data/elk_processed.rda")
head(elk_gps)
## datetime lon lat id
## 1 2001-12-13 07:01:00 -115.8043 52.12410 4049
## 2 2001-12-13 09:01:00 -115.8003 52.11762 4049
## 3 2001-12-14 09:01:00 -115.8281 52.09611 4049
## 4 2001-12-14 11:00:00 -115.8318 52.09829 4049
## 5 2001-12-14 17:02:00 -115.8042 52.09482 4049
## 6 2001-12-14 19:01:00 -115.8037 52.12493 4049
Same 3 individuals as in the MCP lab:
elk_res <- elk_gps |>
subset(id %in% c("YL80", "YL91", "YL94")) |>
mutate(id = droplevels(id))
elk_res_sf <- elk_res |>
st_as_sf(coords = c("lon","lat"), crs = 4326) |>
st_transform(32611)
elk_res <- cbind(elk_res, st_coordinates(elk_res_sf))
Kernel density estimation (KDE) is a slightly more complex method for calculating homerange areas, also implemented in the adehabitatHR package, which defines the KDE homerange as the “minimum area in which an animal has some specified probability of being located”.
KDE applies a user-specified kernel function to the input points/locations in order to estimate the probability of occurrence, or “utilization distribution” (UD), within each pixel of a user-specified grid. The choice of resolution for this grid involves a trade-off: a finer resolution gives a smoother surface but is more memory-intensive to calculate. The choice of resolution should be informed by the resolution of the input data.
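To make the idea concrete, here is a minimal base-R sketch (synthetic points, not the elk data) that evaluates a bivariate normal KDE over a grid by hand, the way `kernelUD` does internally:

```r
# Evaluate a bivariate-normal KDE by hand on a coarse grid (synthetic data)
set.seed(1)
x <- rnorm(100); y <- rnorm(100)   # fake "relocations"
h <- 0.5                           # smoothing bandwidth
res <- 0.25                        # grid resolution
gx <- seq(-4, 4, by = res)
gy <- seq(-4, 4, by = res)
ud <- outer(gx, gy, Vectorize(function(px, py)
  mean(dnorm(px, x, h) * dnorm(py, y, h))))  # average kernel contribution per pixel
# the UD is a probability density: cell value times cell area sums to ~1
sum(ud * res^2)
```

A finer `res` (smaller cells) gives a smoother-looking surface but a much larger matrix to store, which is the memory trade-off mentioned above.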
A commonly used kernel for KDE (and the default for the adehabitatHR function, kernelUD) is the “bivariate normal kernel”. The estimate at each pixel averages the kernel contributions of all observations and depends, importantly, on a user-specified smoothing parameter, h (also called the “bandwidth”). The choice of h can significantly impact results: a larger h results in more smoothing, i.e., a larger distance over which a data point influences the utilization distribution. The adehabitatHR package uses the “reference bandwidth” as the default in its kernelUD function, but additional options exist (e.g., the “least-squares cross-validation” method, or a user-chosen value).
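For reference, the default “reference bandwidth” has a simple closed form. The sketch below follows the ad hoc formula given in the kernelUD documentation (treat it as an approximation of what the package computes, not a substitute for calling kernelUD itself):

```r
# "Reference" bandwidth: h = Sigma * n^(-1/6), with Sigma = 0.5 * (sd(x) + sd(y))
href <- function(x, y) {
  0.5 * (sd(x) + sd(y)) * length(x)^(-1/6)
}
set.seed(1)
x <- rnorm(500, sd = 1000)   # synthetic UTM-like coordinates (meters)
y <- rnorm(500, sd = 1000)
href(x, y)   # in the same units as the coordinates (here, meters)
```

Note the behavior this implies: more spread-out points enlarge h (more smoothing), while larger samples shrink it slightly.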
Users can also specify the percentage of density to use for homerange
estimation (usually 50-95%), within the getverticeshr
function, which calculates the homerange area using the estimated
utilization distribution from the kernelUD function.
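What that percent means can be sketched in one dimension: take the highest-density cells first until they contain the requested fraction of the distribution (base R, synthetic data standing in for a UD):

```r
# Smallest set of high-density cells holding 95% of a toy 1-D "UD"
xs <- seq(-4, 4, by = 0.01)
p <- dnorm(xs) / sum(dnorm(xs))        # normalized cell probabilities
ord <- order(p, decreasing = TRUE)     # densest cells first
n95 <- which(cumsum(p[ord]) >= 0.95)[1]
range(xs[ord[1:n95]])                  # roughly +/- 1.96 for a standard normal
```

Lowering the percent (e.g., to 50%) keeps only the densest core of the distribution, which is why 50% contours are often used to delineate “core areas”.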
Similar to the mcp function, the kernelUD
and getverticeshr functions are applied to one individual
at a time. To apply them to all individuals at once, we can write our own
function. The function takes an sf object (with an id column), a
user-specified percent, and any additional arguments to pass along to
kernelUD (e.g., the grid). It converts the data into the
“SpatialPointsDataFrame” format and applies kernelUD and
getverticeshr to get the estimated homerange contour for each
individual. The contours are then converted back to an sf object (as
polygons), with the area of each polygon (in hectares) included as a
column by getverticeshr; this can be checked against st_area.
We’ll do it here for one animal:
myelk_sf <- elk_res_sf |> subset(id == "YL80")
myelk_sp <- myelk_sf |> st_geometry() |> as_Spatial()
myelk_kernelud <- kernelUD(myelk_sp, grid = 200)
This object estimates a whole density function, which looks like this:
plot(myelk_kernelud)
You can convert this to a raster, which is spatially
referenced:
require(raster)
my_kernel <- raster(myelk_kernelud)
my_kernel
## class : RasterLayer
## dimensions : 94, 200, 18800 (nrow, ncol, ncell)
## resolution : 423.9829, 423.9829 (x, y)
## extent : 554679.8, 639476.4, 5710726, 5750580 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
## source : memory
## names : ud
## values : 0, 5.957969e-08 (min, max)
plot(my_kernel)
And you can even use this in a mapview:
mapview(my_kernel)
Contours can be helpful:
plot(my_kernel)
contour(my_kernel, add = TRUE)
Or, for fun, a 3D plot:
persp(my_kernel, border = NA, shade = TRUE)
From this, you can calculate a few metrics. For example, if you want the border of a 95% kernel (somewhat similar to what the MCP returns):
myelk_kde_poly <- getverticeshr(myelk_kernelud, percent = 95) |>
st_as_sf()
myelk_kde_poly
## Simple feature collection with 1 feature and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 591675.7 ymin: 5723552 xmax: 609537.9 ymax: 5737419
## Projected CRS: +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
## id area geometry
## homerange homerange 7617.674 MULTIPOLYGON (((607520.6 57...
Note that you can easily obtain the area of this polygon:
st_area(myelk_kde_poly)
## 76176742 [m^2]
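The two numbers agree: getverticeshr reports area in hectares, while st_area returns square meters (1 ha = 10,000 m²):

```r
# convert the st_area result (m^2) to hectares
76176742 / 1e4   # 7617.674 ha, matching the `area` column above
```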
And map this:
mapview(myelk_kde_poly) + mapview(myelk_sf)
We can also write a function that obtains all this information:
getKernelPoly <- function(sf, percent = 95, ...){
  sp <- sf |> mutate(id = droplevels(id))
  as_Spatial(sp[, "id"], cast = TRUE, IDs = "id") |>
    kernelUD(...) |>
    getverticeshr(percent = percent) |>
    st_as_sf()
}
You can use this to compare different parameters, like the kernel choice
(remember the difference between the bivnorm and
epa kernels):
kde_poly_norm <- getKernelPoly(elk_res_sf |> subset(id == "YL91"), kern = "bivnorm")
kde_poly_epa <- getKernelPoly(elk_res_sf |> subset(id == "YL91"), kern = "epa")
kde_compare_kernels <- rbind(
kde_poly_norm |> mutate(type = "Bivariate Normal"),
kde_poly_epa |> mutate(type = "Epanechnikov"))
ggplot(kde_compare_kernels) + geom_sf(aes(fill = type), alpha = .5) +
geom_sf(data = elk_res_sf |> subset(id == "YL91"), alpha = .2, size = 1)
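The difference between the two estimated home ranges comes from the kernel shapes themselves; a quick base-R comparison in one dimension (independent of adehabitatHR):

```r
# Epanechnikov vs Gaussian kernel shapes in 1-D
u <- seq(-2, 2, by = 0.01)
epa <- ifelse(abs(u) < 1, 0.75 * (1 - u^2), 0)  # compact support on [-1, 1]
gau <- dnorm(u)                                  # infinite support (tails)
sum(epa) * 0.01   # ~1: the Epanechnikov kernel integrates to 1
```

Because the Epanechnikov kernel is exactly zero beyond one bandwidth, its utilization distribution drops to zero sooner, which tends to produce tighter contours than the bivariate normal.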
Or … you can compute all of the kernels for a bunch of individuals(!)
First recreate the full dataset as an sf object:
elk_sf <- elk_gps |> st_as_sf(coords = c("lon", "lat"), crs = 4326)
KDE_allElks <- getKernelPoly(elk_sf |> st_transform(32611), kern = "epa")
All of the KDE homeranges plotted:
ggplot(KDE_allElks) +
geom_sf(aes(fill = id, color = id), alpha = .2)
Now we can get some real statistics:
KDE_allElks
## Simple feature collection with 31 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 535937.6 ymin: 5692262 xmax: 616298.8 ymax: 5782179
## Projected CRS: +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
## First 10 features:
## id area geometry
## 4049 4049 60414.1689 MULTIPOLYGON (((596325.6 57...
## GP1 GP1 372.4074 MULTIPOLYGON (((603295.7 57...
## GP2 GP2 31352.2432 MULTIPOLYGON (((588359.7 57...
## GR104 GR104 12308.3879 MULTIPOLYGON (((593498.8 57...
## GR182 GR182 12931.9182 MULTIPOLYGON (((587093.1 57...
## GR193 GR193 90500.7610 MULTIPOLYGON (((538593.3 57...
## GR196 GR196 13729.4068 MULTIPOLYGON (((571556.2 57...
## YL15 YL15 4676.0568 MULTIPOLYGON (((602577.2 57...
## YL2 YL2 17027.3299 MULTIPOLYGON (((576949.1 57...
## YL25 YL25 31357.1322 MULTIPOLYGON (((549365.4 57...
summary(KDE_allElks$area)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 372.4 6645.3 14057.2 23995.6 31354.7 90500.8
Amazing variation in the kernel estimates!
Fieberg and Kochanny 2010 argue convincingly that if you are estimating overlap between multiple organisms, you should leverage the power of the whole kernel density.
Let’s get two kernel densities:
elk1 <- elk_res_sf |> subset(id == "YL80")
elk2 <- elk_res_sf |> subset(id == "YL94")
kernel1 <- elk1 |> st_geometry() |> as_Spatial() |> kernelUD(grid = 200)
kernel2 <- elk2 |> st_geometry() |> as_Spatial() |> kernelUD(grid = 200)
Plotting both kernels as contours:
contour(kernel1,
xlim = c(590e3, 610e3), ylim = c(5720e3, 5745e3),
col = "blue")
contour(kernel2, add = TRUE, col = "red")
axis(1); axis(2)
To compute overlap, you just have to convert your sf
object into a SpatialPointsDataFrame - but -
importantly, keep id as the only column so it is used as the ID.
VI is the “volume of intersection” measure, which ranges from 0 (no overlap) to 1 (identical utilization distributions).
threeElks_sp <- as_Spatial(elk_res_sf[,"id"], cast = TRUE, IDs = "id")
kerneloverlap(threeElks_sp, method = "VI", kern = "epa")
## YL80 YL91 YL94
## YL80 1.0154700 0.6775542 0.6936280
## YL91 0.6775542 1.0066410 0.5842290
## YL94 0.6936280 0.5842290 0.9959909
Note how high that overlap is! (Each diagonal entry is an animal’s overlap with itself, which should equal exactly 1; the small deviations reflect grid-discretization error.)
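To see what VI is actually computing, here is a minimal 1-D sketch, with two synthetic normal densities standing in for the elk utilization distributions:

```r
# VI = integral of min(UD1, UD2): 1 = identical ranges, 0 = disjoint ranges
xs <- seq(-6, 6, by = 0.01)
ud1 <- dnorm(xs, mean = 0, sd = 1)
ud2 <- dnorm(xs, mean = 1, sd = 1)
sum(pmin(ud1, ud2)) * 0.01   # = 2 * pnorm(-0.5) for these two normals, ~0.617
```

By this yardstick, the off-diagonal values of 0.58-0.69 in the elk table indicate substantial sharing of space among the three individuals.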